18 research outputs found

    Brief Announcement: Update Consistency in Partitionable Systems

    Get PDF
    Data replication is essential to ensure reliability, availability and fault-tolerance of massive distributed applications over large scale systems such as the Internet. However, these systems are prone to partitioning, which by Brewer's CAP theorem [1] makes it impossible to use a strong consistency criterion like atomicity. Eventual consistency [2] guaranties that all replicas eventually converge to a common state when the participants stop updating. However, it fails to fully specify shared objects and requires additional non-intuitive and error-prone distributed specification techniques, that must take into account all possible concurrent histories of updates to specify this common state [3]. This approach, that can lead to specifications as complicated as the implementations themselves, is limited by a more serious issue. The concurrent specification of objects uses the notion of concurrent events. In message-passing systems, two events are concurrent if they are enforced by different processes and each process enforced its event before it received the notification message from the other process. In other words, the notion of concurrency depends on the implementation of the object, not on its specification. Consequently, the final user may not know if two events are concurrent without explicitly tracking the messages exchanged by the processes. A specification should be independent of the system on which it is implemented. We believe that an object should be totally specified by two facets: its abstract data type, that characterizes its sequential executions, and a consistency criterion, that defines how it is supposed to behave in a distributed environment. Not only sequential specification helps repeal the problem of intention, it also allows to use the well studied and understood notions of languages and automata. This makes possible to apply all the tools developed for sequential systems, from their simple definition using structures and classes to the most advanced techniques like model checking and formal verification. Eventual consistency (EC) imposes no constraint on the convergent state, that very few depends on the sequential specification. For example, an implementation that ignores all the updates is eventually consistent, as all replicas converge to the initial state. We propose a new consistency criterion, update consistency (UC), in which the convergent state must be obtained by a total ordering of the updates, that contains the sequential order of eachComment: in DISC14 - 28th International Symposium on Distributed Computing, Oct 2014, Austin, United State

    Signature-Free Asynchronous Binary Byzantine Consensus with t<<n/3, O(nÂČ) Messages, and O(1) Expected Time

    Get PDF
    International audienceThis paper is on broadcast and agreement in asynchronous message-passing systems made up of n processes, and where up to t processes may have a Byzantine Behavior. Its first contribution is a powerful , yet simple, all-to-all broadcast communication abstraction suited to binary values. This abstraction, which copes with up to t < n/3 Byzantine processes, allows each process to broadcast a binary value, and obtain a set of values such that (1) no value broadcast only by Byzantine processes can belong to the set of a correct process, and (2) if the set obtained by a correct process contains a single value v, then the set obtained by any correct process contains v. The second contribution of the paper is a new round-based asynchronous consensus algorithm that copes with up to t < n/3 Byzantine processes. This algorithm is based on the previous binary broadcast abstraction and a weak common coin. In addition of being signature-free and optimal with respect to the value of t, this consensus algorithm has several noteworthy properties: the expected number of rounds to decide is constant; each round is composed of a constant number of communication steps and involves O(nÂČ) messages; each message is composed of a round number plus a constant number of bits. Moreover , the algorithm tolerates message reordering by the adversary (i.e., the Byzantine processes)

    Asynchronous Byzantine Systems: From Multivalued to Binary Consensus with t < n/3, O(nÂČ) Messages, O(1) Time, and no Signature

    Get PDF
    International audienceThis paper presents a new algorithm that reduces multivalued consensus to binary consensus in an asyn-chronous message-passing system made up of n processes where up to t may commit Byzantine failures. This algorithm has the following noteworthy properties: it assumes t < n/3 (and is consequently optimal from a resilience point of view), uses O(nÂČ) messages, has a constant time complexity, and does not use signatures. The design of this reduction algorithm relies on two new all-to-all communication abstractions. The first one allows the non-faulty processes to reduce the number of proposed values to c, where c is a small constant. The second communication abstraction allows each non-faulty process to compute a set of (proposed) values such that, if the set of a non-faulty process contains a single value, then this value belongs to the set of any non-faulty process. Both communication abstractions have an O(nÂČ) message complexity and a constant time complexity. The reduction of multivalued Byzantine consensus to binary Byzantine consensus is then a simple sequential use of these communication abstractions. To the best of our knowledge, this is the first asynchronous message-passing algorithm that reduces multivalued consensus to binary consensus with O(nÂČ) messages and constant time complexity (measured with the longest causal chain of messages) in the presence of up to t < n/3 Byzantine processes, and without using cryptography techniques. Moreover, this reduction algorithm tolerates message reordering by Byzantine processes

    Dumbo-MVBA: Optimal Multi-valued Validated Asynchronous Byzantine Agreement, Revisited

    Get PDF
    Multi-valued validated asynchronous Byzantine agreement (MVBA), proposed in the elegant work of Cachin et al. (CRYPTO \u2701), is fundamental for critical fault-tolerant services such as atomic broadcast in the asynchronous network. It was left as an open problem to asymptotically reduce the O(ln2+n2∗lambda+n3)O(ln^2+n^2*lambda+n^3) communication (where nn is the number of parties, ll is the input length, and lambdalambda is the security parameter). Recently, Abraham et al. (PODC \u2719) removed the n3n^3 term to partially answer the question when input is small. However, in other typical cases, e.g., building atomic broadcast through MVBA, the input length l>=n∗lambdal >= n*lambda, and thus the communication is dominated by the ln2ln^2 term and the problem raised by Cachin et al. remains open. We fill the gap and answer the remaining part of the above open problem. In particular, we present two MVBA protocols with O(ln+n2∗lambda)O(ln+n^2*lambda) communicated bits, which is optimal when l>=n∗lambdal >= n*lambda. We also maintain other benefits including optimal resilience to tolerate up to n/3n/3 adaptive Byzantine corruptions, optimal expected constant running time, and optimal O(n2)O(n^2) messages. At the core of our design, we propose asynchronous provable dispersal broadcast (APDB) in which each input can be split and dispersed to every party and later recovered in an efficient way. Leveraging APDB and asynchronous binary agreement, we design an optimal MVBA protocol, Dumbo-MVBA; we also present a general self-bootstrap framework Dumbo-MVBA* to reduce the communication of any existing MVBA protocols

    Dumbo: Faster Asynchronous BFT Protocols

    Get PDF
    HoneyBadgerBFT, proposed by Miller et al. [32] as the first practical asynchronous atomic broadcast protocol, demonstrated impressive performance. The core of HoneyBadgerBFT (HB-BFT) is to achieve batching consensus using asynchronous common subset protocol (ACS) of Ben-Or et al., constituted with nn reliable broadcast protocol (RBC) to have each node propose its input, followed by nn asynchronous binary agreement protocol (ABA) to make a decision for each proposed value (nn is the total number of nodes). In this paper, we propose two new atomic broadcast protocols (called Dumbo1, Dumbo2) both of which have asymptotically and practically better efficiency. In particular, the ACS of Dumbo1 only runs a small kk (independent of nn) instances of ABA, while that of Dumbo2 further reduces it to constant! At the core of our techniques are two major observations: (1) reducing the number of ABA instances significantly improves efficiency; and (2) using multi-valued validated Byzantine agreement (MVBA) which was considered sub-optimal for ACS in [32] in a more careful way could actually lead to a much more efficient ACS. We implement both Dumbo1, Dumbo2 and deploy them as well as HB-BFT on 100 Amazon EC2 t2.medium instances uniformly distributed throughout 10 different regions across the globe, and run extensive experiments in the same environments. The experimental results show that our protocols achieve multi-fold improvements over HoneyBadgerBFT on both latency and throughput, especially when the system scale becomes moderately large

    Intrusion-Tolerant Broadcast and Agreement Abstractions in the Presence of Byzantine Processes

    Get PDF
    International audienceA process commits a Byzantine failure when its behavior does not comply with the algorithm it is assumed to execute. Considering asynchronous message-passing systems, this paper presents distributed abstractions, and associated algorithms, that allow non-faulty processes to correctly cooperate, despite the uncertainty created by the net effect of asynchrony and Byzantine failures. These abstractions are broadcast abstractions (namely, no-duplicity broadcast, reliable broadcast, and validated broadcast), and agreement abstraction (namely, consensus). While no-duplicity broadcast and reliable broadcast are well-known one-to-all communication abstractions , validated broadcast is a new all-to-all communication abstraction designed to address agreement problems. After having introduced these abstractions, the paper presents an algorithm implementing validated broadcast on top of reliable broadcast. Then the paper presents two consensus algorithms, which are reductions of multivalued consensus to binary consensus. The first one is a generic algorithm, that can be instantiated with unreliable broadcast or no-duplicity broadcast, while the second is a consensus algorithm based on validated broadcast. Finally, a third algorithm is presented that solves the binary consensus problem. This algorithm is a randomized algorithm based on validated broadcast and a common coin. The presentation of all the abstractions and their algorithms is done incrementally

    Extension de la Hiérarchie de Herlihy aux SystÚmes Multi-threads

    No full text
    International audienceDans les systĂšmes d'exploitation et les langages de programmation modernes adaptĂ©s aux architectures informatiques multicoeurs, le parallĂ©lisme est abstrait par la notion de threads d'exĂ©cution. Les systĂšmes multi-threads ont deux spĂ©cificitĂ©s majeures : 1) de nouveaux threads peuventĂȘtre crĂ©Ă©s dynamiquement pendant l'exĂ©cution, il n'y a donc pas de limite sur le nombre total de threads participantĂ  une exĂ©cution continue ; 2) les threads ont accĂšsĂ  un mĂ©canisme d'allocation de mĂ©moire qui ne peut pas allouer des tableaux infinis. Cela rend difficile d'adapter certains algorithmes a des systĂšmes multi-threads, en particulier ceux qui attribuent un registre partagĂ©Ă  chaque processus. Cet article, qui est une traduction rĂ©sumĂ©e et vulgarisĂ©e de prĂ©cĂ©dents travaux [PMB20], explore la puissance de coordination des objets partagĂ©s dans les systĂšmes multi-threads enĂ©tendant la hiĂ©rarchie de Herlihy pour tenir compte de ces contraintes. Il propose de subdiviser en cinq nouveaux degrĂ©s l'ensemble des objets de numĂ©ro de consensus infini, selon leur capacitĂ©Ă  synchroniser un nombre bornĂ©, fini ou infini de processus, avec ou sans la nĂ©cessitĂ© d'allouer un tableau infini. Il prĂ©sente ensuite un objet illustrant chaque degrĂ© proposĂ©
    corecore